Applications of Bayesian Decision Theory to Sequential Mastery Testing



Hans J. Vos 



Abstract 



The purpose of this paper is to formulate optimal sequential rules for mastery tests. The 
framework for the approach is derived from empirical Bayesian decision theory. Both a 
threshold and linear loss structure are considered. The binomial probability distribution is 
adopted as the psychometric model involved. Conditions sufficient for sequentially setting 
optimal cutting scores are presented. Optimal sequential rules will be derived for the case of a 
beta distribution representing prior true level of functioning. An empirical example of 
sequential mastery testing for concept-learning in medicine concludes the paper. 



In a fixed-length mastery test, the decision is to classify students as either a master or a 
nonmaster. During the last two decades, the fixed-length mastery problem has been studied 
extensively by many researchers (e.g., Cronbach & Gleser, 1965; Davis et al., 1973; De 
Gruijter & Hambleton, 1984; Hambleton & Novick, 1973; Huynh, 1976, 1980; Swaminathan 
et al., 1975; van der Linden, 1980, 1990; van der Linden & Mellenbergh, 1977; Wilcox, 
1977). Most of these authors derived, analytically or numerically, optimal rules by applying 
(empirical) Bayesian decision theory (e.g., DeGroot, 1970; Lehmann, 1959; Lindgren, 1976) 
to this problem. The application of (empirical) Bayesian methods to decision making consists 
of two basic elements: A psychometric model relating observed test scores and student’s true 
level of functioning to each other, and a loss structure evaluating the total costs and benefits of 
all possible decision outcomes. Optimal rules are derived by minimizing the posterior 
expected loss. 

Besides the fixed-length mastery problem, attention has also been paid to the variable-length
mastery problem. In this type of problem the decision is to classify students as a master,
a nonmaster, or to present another item. The main goal of a variable-length mastery test is to
provide shorter tests for students who have clearly attained a certain level of mastery (or
clearly nonmastery) and longer tests for those students for whom the mastery decision is not
as clear-cut (Lewis & Sheehan, 1990). In case the items are randomly selected, the variable-length
mastery problem is also known as a sequential or multistage mastery problem. If a
computer is used for administering and scoring the items (e.g., Lewis & Sheehan, 1990; 
Sheehan & Lewis, 1992), and the optimal rule is determined using sequential decision theory, 
the mastery test is called a computerized mastery test (CMT). 

One of the earliest sequential mastery tests was designed by Ferguson (1969a, 1969b) 
using Wald's sequential probability ratio test (SPRT). In Ferguson’s approach, students' 
responses to items are assumed to follow a binomial probability distribution. The binomial 
model assumes that, given the true level of functioning, the probability of answering an item
correctly is equal for all items in the pool, or that items are sampled at random. Using item
response theory (IRT) models, Reckase (1983) and Kingsbury and Weiss (1983) proposed 
alternative sequential mastery testing procedures within an SPRT-framework. In both 
procedures, as opposed to Ferguson’s approach, items are not assumed to have equal difficulty 
but are allowed to vary in difficulty and discrimination. In addition, the next item to be
presented to the student is not selected randomly but is based on the principle of maximizing
the amount of information. Hence, the item selection procedures proposed by Reckase (1983) 
and Kingsbury and Weiss (1983) are adaptive instead of random (also see Spray & Reckase, 
1996). 

In the Lewis and Sheehan (1990) model, Bayesian theory is used to determine the 
optimal sequential number of equivalent testlets (i.e., short blocks of parallel items) to be 
randomly administered to the student. As in Reckase (1983) and Kingsbury and Weiss (1983), 
the conditional probability of a correct response, given the true level of functioning, is 
modeled using IRT. A threshold loss function is assumed from which the posterior expected 
losses associated with the mastery and nonmastery decisions can be calculated at each stage of 
sampling. The posterior expected loss associated with continuing sampling is determined 
considering all possible decision outcomes of future randomly presented items by backward 
induction. The optimal sequential decision rule is now found by selecting the action (i.e., 
mastery, nonmastery, or to continue sampling) that minimizes posterior expected loss at each 
stage of sampling. Doing so, as indicated by Lewis and Sheehan (1990), the action selected at 
each stage of sampling is optimal with respect to the entire sequential mastery testing 
procedure. 

The purpose of the present paper is to derive optimal rules for sequential mastery tests. 
As in the Lewis and Sheehan model, optimal sequential rules are determined using Bayesian 
decision theory. Our approach differs from Lewis and Sheehan, however, in the following five 
respects. First, as in Ferguson’s approach, for the conditional probability of a correct response 
given the true level of functioning (i.e., the psychometric model), the binomial instead of an 
IRT model is considered. Second, in addition to threshold loss, optimal sequential rules are also
derived for linear loss. Third, conditions sufficient for sequentially setting optimal cutting
scores are presented. Fourth, optimal sequential rules will be derived when prior true level of
functioning is determined through an analysis of empirical data (i.e., empirical Bayesian
approach) instead of through a subjective assessment. It will be assumed in the present paper
that prior true level of functioning can be characterized by a beta distribution, which will be
examined against the data in the empirical example. Finally, four instead of three possible
actions are distinguished, namely the action administering one more randomly selected item 
and the three classification actions mastery, partial mastery, and nonmastery. The paper 
concludes with an empirical example of a computerized mastery test for concept-learning in 
medicine. 
The Sequential Four-Action Mastery Problem

In the following, a sequential four-action mastery test is supposed to have a maximum length
of $n$ ($n \ge 1$). Following Ferguson (1969a, 1969b), a maximum test length is specified for those
students who are very difficult to classify as a master, partial master, or
nonmaster. Let the observed item response at each stage of sampling $k$ ($1 \le k \le n$) for a
randomly sampled student be denoted by a discrete random variable $X_k$, with realization $x_k$.
The observed response variable $X_k$ takes the value 1 for a correct response and 0 for an
incorrect response to the $k$th item. The variables $X_1, \ldots, X_k$ are assumed to be independent and
identically distributed for each value of $k$ ($1 \le k \le n$). Let $S_k = X_1 + \cdots + X_k$ ($1 \le k \le n$) be the
observed number-correct score variable, with realization $s_k = x_1 + \cdots + x_k$ ($0 \le s_k \le k$).
Furthermore, due to measurement and sampling error, the sequential four-action mastery test
is assumed not to be a perfect indicator of student's true performance. Therefore, let student's
true level of functioning $t \in [0,1]$ at each stage of sampling $k$ ($1 \le k \le n$) be denoted by a
continuous random variable $T$.

Suppose $X_1 = x_1, \ldots, X_k = x_k$ has been observed. Then the two fundamental elements of
the application of Bayesian methods to sequential decision making discussed earlier can be
formulated as follows: a loss function describing the loss $l(a_i(x_1,\ldots,x_k), t)$ incurred when action
$a_i(x_1,\ldots,x_k)$ is taken for the student whose true level of functioning is $t$, and a psychometric
model relating observed number-correct score $s_k$ to student's true level of functioning $t$ at each
stage of sampling $k$ ($1 \le k \le n$). In fact, it is the unreliability of the test that opens the
possibility of applying (sequential) Bayesian methods to the problem of determining the
optimal number of items (Hambleton & Novick, 1973).

In the sequential four-action mastery problem, given $X_1 = x_1, \ldots, X_k = x_k$, the following
four actions are available to the decision-maker at each stage of sampling $k$ ($1 \le k < n$): First,
declare nonmastery to a student, $a_1(x_1,\ldots,x_k)$, if his/her number-correct score $s_k$ is equal to or
below a certain cutting score $s_{c1}(k)$ on the observed number-correct score scale $S_k$. Second,
declare partial mastery to a student, $a_2(x_1,\ldots,x_k)$, if his/her number-correct score $s_k$ exceeds
$s_{c1}(k)$ but is below a certain cutting score $s_{c2}(k)$ on $S_k$, where $s_{c1}(k) < s_{c2}(k)$. Third, declare
mastery to a student, $a_3(x_1,\ldots,x_k)$, if his/her number-correct score $s_k$ is equal to or exceeds
$s_{c2}(k)$. Fourth, continue sampling, $a_4(x_1,\ldots,x_k)$, if the posterior expected loss associated with
administering one more random item is minimal. For the final stage of sampling, $n$, only the
three classification actions nonmastery, partial mastery, and mastery are available to the
decision-maker.

It is important to notice that, linking up with common practice in criterion-referenced
testing, the optimal sequential rules with respect to the three mastery classification decisions are
assumed to have monotone forms; that is, rules in the form of cutting scores $s_{c1}(k)$ and $s_{c2}(k)$.
Conditions sufficient for optimal sequential rules to be monotone are given later on.

Let the criterion levels $t_{c1}$ and $t_{c2}$ ($0 \le t_{c1} < t_{c2} \le 1$) represent the highest and lowest true
level of functioning at which a student will be considered a true nonmaster and a true master,
respectively. Furthermore, a student will be considered a true partial master if his/her true
level of functioning exceeds $t_{c1}$ but is below $t_{c2}$. The two criterion levels $t_{c1}$ and $t_{c2}$ must be
specified in advance by the decision-maker (e.g., Angoff, 1971; Ebel, 1972; Nedelsky, 1954).
Given the values of the criterion levels $t_{c1}$ and $t_{c2}$ on $T$, the sequential four-action mastery
problem can now be stated at each stage of sampling $k$ ($1 \le k < n$) as choosing values of $s_{c1}(k)$
and $s_{c2}(k)$, or continuing sampling, such that the posterior expected loss is minimal. For the final
stage of sampling, $n$, our sequential mastery problem reduces to choosing values of $s_{c1}(n)$ and
$s_{c2}(n)$ such that the posterior expected loss is minimal.

Loss Structure 

Generally speaking, a loss function evaluates the total costs and benefits of all possible 
decision outcomes for a student whose true level of functioning is t. These costs may concern 
all relevant psychological, social, and economic consequences which the decision brings along 
(e.g., extra computer time associated with presenting randomly additional items). The 
Bayesian approach allows the decision-maker to incorporate into the decision process the 
costs of misclassifications (i.e., students for whom the wrong decision is made). 

In this section, as in the Lewis and Sheehan model, first the well-known threshold loss 
function will be discussed. Next, it is argued that in many situations the linear loss structure is 
a more realistic representation of the losses actually incurred. 

Threshold Loss 

The choice of this function implies that the costs and benefits involved can be summarized by
possibly different constants for each possible decision outcome. Although this function may
be less realistic in some applications, it has been studied extensively in the psychometric
literature, in particular in the (sequential) mastery testing literature (e.g., Ben-Shakhar &
Beller, 1983; Chuang et al., 1981; Davis et al., 1973; Hambleton & Novick, 1973; Huynh,
1976; Lewis & Sheehan, 1990; Novick & Lewis, 1974; Raju et al., 1991; Swaminathan et al.,
1975).

Following Lewis and Sheehan (1990), a threshold loss function for our sequential
mastery problem can be formulated as a natural extension of the one for the standard
fixed-length two-action problem at each stage of sampling $k$ ($1 \le k \le n$) as follows:

Table 1. Threshold loss function at stage $k$ ($1 \le k \le n$) of sampling.

                                             True Level
Action                    $T \le t_{c1}$    $t_{c1} < T < t_{c2}$    $T \ge t_{c2}$

$a_1(x_1,\ldots,x_k)$     $ke$              $l_{12} + ke$            $l_{13} + ke$
$a_2(x_1,\ldots,x_k)$     $l_{21} + ke$     $ke$                     $l_{23} + ke$
$a_3(x_1,\ldots,x_k)$     $l_{31} + ke$     $l_{32} + ke$            $ke$


Just as in the Lewis and Sheehan model, the value e represents the costs of 
administering one random item. For the sake of simplicity, again following Lewis and 
Sheehan, the costs of administering one random item are assumed to be equal for each 
decision outcome as well as for each sampling occasion. Of course, these two assumptions 
can be relaxed in specific sequential mastery testing applications. 
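
To make the structure of Table 1 concrete, the following is a minimal Python sketch of the threshold loss function. The function name, the dictionary layout for the losses $l_{ij}$, and the numerical values used in any call are illustrative assumptions, not values from the paper.

```python
def threshold_loss(action, t, k, t_c1, t_c2, losses, e):
    """Threshold loss of Table 1 for a classification action at stage k.

    action: 1 (nonmastery), 2 (partial mastery) or 3 (mastery).
    losses: dict mapping (i, j), i != j, to the misclassification loss l_ij
            (hypothetical layout; the paper only names the parameters).
    e:      cost of administering one item, charged k times.
    """
    # Determine the true state j from the true level of functioning t.
    if t <= t_c1:
        state = 1          # true nonmaster
    elif t < t_c2:
        state = 2          # true partial master
    else:
        state = 3          # true master
    # Correct decisions carry no misclassification loss after rescaling.
    misclassification = 0.0 if action == state else losses[(action, state)]
    return misclassification + k * e
```

A call such as `threshold_loss(3, 0.4, 5, 0.5, 0.8, losses, 0.01)` returns the loss of declaring mastery for a true nonmaster after five items.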

When optimizing the decision rule, a loss function needs to be determined only up to a
positive multiplicative constant and an additive constant (e.g., Luce & Raiffa, 1957).
Therefore, assuming the losses $l_{11}$, $l_{22}$, and $l_{33}$ associated with the correct decision outcomes
are equal and take the smallest values, the threshold loss function in Table 1 was rescaled in
such a way that $l_{11}$, $l_{22}$, and $l_{33}$ were equal to zero. Consequently, the rescaled losses $l_{ij}$
($i, j = 1,2,3$; $i \ne j$) associated with the incorrect decisions must take positive values.

Furthermore, it follows immediately from the way actions $a_i(x_1,\ldots,x_k)$ ($i = 1,2,3$) were
defined that action $a_1(x_1,\ldots,x_k)$ is most appropriate when $t$ is small, whereas action $a_2(x_1,\ldots,x_k)$
is most appropriate when $t$ takes intermediate values, and action $a_3(x_1,\ldots,x_k)$ is most
appropriate when $t$ is large. As a result, the loss functions associated with actions $a_1(x_1,\ldots,x_k)$
and $a_3(x_1,\ldots,x_k)$ must be nondecreasing and nonincreasing in $t$, respectively. As far as the loss
function associated with action $a_2(x_1,\ldots,x_k)$ (i.e., partial mastery) is concerned, it cannot be
determined beforehand whether the loss for a student whose true level of functioning is below
$t_{c1}$ (i.e., $l_{21}$) is equal to, larger than, or smaller than the loss for a student whose true level of
functioning exceeds $t_{c2}$ (i.e., $l_{23}$). We only know that the loss associated with the correct
partial mastery decision, $l_{22}$, must be smallest.

The loss parameters $l_{ij}$ ($i, j = 1,2,3$; $i \ne j$) have to be empirically assessed. For assessing
loss functions empirically, most texts on decision theory propose lottery methods (e.g., Luce
& Raiffa, 1957, Chap. 2). In general, these methods use the notions of desirability of outcomes
to scale the consequences of each pair of actions and true level of functioning. It may be noted
that, in addition to lottery methods, other psychological scaling methods can be used for
empirically assessing loss parameters as well. For instance, van der Gaag, Mellenbergh, and
van den Brink (1988), van der Gaag (1990), and Vrijhof, Mellenbergh, and van den Brink 
(1983) empirically assessed loss functions using Bechtel’s preference method (Bechtel, 1976) 
and Comrey's constant sum method (Torgerson, 1958). 

Linear Loss 

An obvious disadvantage of the threshold loss function is that it assumes that, for instance, the
same constant loss holds for all 'masters' whose true level of functioning is to the right of $t_{c2}$,
no matter how large their distance from $t_{c2}$. It seems more realistic to suppose that for true
masters the loss is a monotonically decreasing function of $t$ (van der Linden, 1980).

Moreover, the threshold loss function is discontinuous; at the criterion levels $t_{c1}$ and $t_{c2}$
this function "jumps" from one constant value to another. This sudden change seems
unrealistic in many real-life decision making situations. In the neighborhood of these points, 
the losses for correct and incorrect decisions should change smoothly rather than abruptly 
(Davis et al., 1973). 

To overcome these shortcomings, van der Linden and Mellenbergh (1977) proposed a 
continuous loss function for the fixed-length two-action mastery problem which is a linear 
function of student's true level of functioning t (see also van der Linden & Vos, 1996; Vos, 
1990, 1991, 1995, 1997, 1998). For our sequential mastery problem, their linear loss function 
can be restated at each stage of sampling $k$ ($1 \le k \le n$) as follows (see also Davis et al., 1973):

$$
l(a_i(x_1,\ldots,x_k), t) =
\begin{cases}
b_1 (t - t_{c1}) + ke & \text{for } i = 1 \\
b_2 (t - t_{c2}) + ke & \text{for } i = 2 \\
b_3 (t_{c2} - t) + ke & \text{for } i = 3,
\end{cases}
\tag{1}
$$

where $b_i > 0$ ($i = 1,2,3$) and $(b_1 - b_2) > 0$.

At each stage of sampling $k$ ($1 \le k \le n$), the above defined function consists for each
action $a_i(x_1,\ldots,x_k)$ ($i = 1,2,3$) of a constant term and a term proportional to the difference
between the true level of functioning $t$ and the specified criterion level $t_{c1}$ or $t_{c2}$. Analogous to
the threshold loss function, the constant amounts of loss, $e$, associated with administering one
random item are assumed to be equal for each action as well as for each sampling occasion.
The condition $b_1, b_2, b_3 > 0$ is equivalent to the statement that for actions $a_1(x_1,\ldots,x_k)$ and
$a_2(x_1,\ldots,x_k)$, loss is assumed to be a strictly increasing function of $t$, whereas loss for action
$a_3(x_1,\ldots,x_k)$ is assumed to be strictly decreasing in $t$. Furthermore, the condition $(b_1 - b_2) > 0$
states that the loss for action $a_1(x_1,\ldots,x_k)$ increases more quickly in $t$ than for action
$a_2(x_1,\ldots,x_k)$.
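
As a minimal sketch, the linear loss function in (1) might be coded in Python as follows. The function name, the tuple `b` holding the slopes, and the numerical values in any call are illustrative assumptions rather than part of the original formulation.

```python
def linear_loss(action, t, k, t_c1, t_c2, b, e):
    """Linear loss of equation (1) for classification action 1, 2 or 3.

    b: tuple (b1, b2, b3) of slopes; the paper requires b_i > 0 and b1 > b2.
    e: cost of administering one item, charged k times.
    """
    b1, b2, b3 = b
    if min(b) <= 0 or b1 <= b2:
        raise ValueError("need b1, b2, b3 > 0 and b1 > b2")
    if action == 1:        # nonmastery: loss increases in t
        slope_term = b1 * (t - t_c1)
    elif action == 2:      # partial mastery: loss increases in t, more slowly
        slope_term = b2 * (t - t_c2)
    elif action == 3:      # mastery: loss decreases in t
        slope_term = b3 * (t_c2 - t)
    else:
        raise ValueError("action must be 1, 2 or 3")
    return slope_term + k * e
```

With these definitions the loss lines for actions 2 and 3 cross at $t = t_{c2}$, and those for actions 1 and 2 cross at $t = (b_1 t_{c1} - b_2 t_{c2})/(b_1 - b_2)$.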

It should be noted that the linear loss function seems to be a realistic representation of 
the losses actually incurred in many decision making situations. In a recent empirical study, 
van der Gaag (1990) showed that for various real-life fixed-length mastery decisions in 
psychology and education the loss structures can be approximated satisfactorily by linear
functions.

The loss parameters $b_i$ ($i = 1,2,3$) have to be assessed empirically again.



Binomial Distribution as a Psychometric Model

To determine the optimal sequential number of items, a psychometric model to specify the
statistical relation between the observed number-correct score and student's true level of
functioning at each stage of sampling is needed. In the present paper, following Ferguson
(1969a, 1969b), the well-known binomial model will be adopted.

As indicated by van den Brink (1982), when tests are assumed sampled from item
domains, as in our sequential four-action mastery problem, the well-known binomial model is
a natural choice for estimating the distribution of student's number-correct score $s_k$ and
making classification decisions (mastery, partial mastery, nonmastery). The binomial model
assumes that the probability function relating the observed number-correct score $s_k$
($0 \le s_k \le k$) to student's true level of functioning $t$, $f(s_k \mid t)$, at stage $k$ can be written as follows:

$$
f(s_k \mid t) = \binom{k}{s_k} t^{s_k} (1 - t)^{k - s_k}. \tag{2}
$$
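
For illustration, the binomial probability function relating $s_k$ to $t$ can be evaluated with a few lines of Python. This is only a sketch; the function name is an assumption made for the example.

```python
from math import comb


def binomial_pmf(s_k, k, t):
    """Probability of s_k correct responses out of k items, given true level t."""
    if not 0 <= s_k <= k:
        raise ValueError("s_k must lie between 0 and k")
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    # Number of response patterns with s_k correct, times their probability.
    return comb(k, s_k) * t ** s_k * (1.0 - t) ** (k - s_k)
```

For example, `binomial_pmf(2, 3, 0.5)` is the probability of two correct answers on three items for a student with $t = 0.5$.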
Conditions Sufficient for Sequentially Setting Cutting Scores

As far as the three mastery classification decisions are concerned, as noted earlier, we confine
ourselves in this paper to monotone rules. The restriction to monotone rules, however, is 
correct only if it can be proven that for any nonmonotone rule for the problem at hand there is 
a monotone rule with at least the same value on the criterion of optimality used (Ferguson, 
1967, p.55). In a Bayesian fashion, the posterior expected loss is taken as the criterion of 
optimality. 

The posterior expected loss for continuing sampling is determined by averaging the 
posterior expected loss associated with each of the possible future decision outcomes relative 
to the probability of observing those outcomes (Lewis & Sheehan, 1990). Therefore, it follows 
immediately that the conditions sufficient for setting cutting scores for the fixed-length three- 
action mastery problem at each stage of sampling, are also sufficient for the sequential four- 
action mastery problem. Generally, conditions sufficient for setting cutting scores for the 
fixed-length multiple-decision problem are given in Ferguson (1967, p.286). 

First, the probability model relating observed number-correct score $s_k$ to student's true
level of functioning $t$, $f(s_k \mid t)$, must have a monotone likelihood ratio (MLR); that is, it is
required that for any $t_1 > t_2$, the likelihood ratio $f(s_k \mid t_1)/f(s_k \mid t_2)$ is a nondecreasing function of
$s_k$. MLR implies that a high true level of functioning tends to coincide with a high observed
number-correct score. Second, the condition of monotone loss must hold; that is, there must
be an ordering of the actions such that for each pair of adjacent actions the loss functions have
at most one point in which the difference between the losses changes sign.

The condition of MLR holds for the binomial distribution, since this distribution 
belongs to the one-parameter exponential family which is well known to have MLR (e.g., 
Hogg & Craig, 1978). Generally, as shown by Gray (1988), for $f(s_k \mid t)$ to have MLR it is
sufficient to show that the items have nondecreasing item characteristic functions. 
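
The MLR property of the binomial model can also be checked numerically. The sketch below is an illustration only, with function names chosen for the example; it uses the fact that the binomial coefficients cancel in the likelihood ratio and assumes both levels lie strictly between 0 and 1.

```python
def likelihood_ratio(s_k, k, t_high, t_low):
    """Binomial likelihood ratio f(s_k | t_high) / f(s_k | t_low).

    The binomial coefficient cancels, so only the power terms remain.
    Both levels are assumed to lie strictly between 0 and 1.
    """
    return (t_high / t_low) ** s_k * ((1 - t_high) / (1 - t_low)) ** (k - s_k)


def has_monotone_likelihood_ratio(k, t_high, t_low):
    """Check that the ratio is nondecreasing in s_k for s_k = 0, ..., k."""
    ratios = [likelihood_ratio(s, k, t_high, t_low) for s in range(k + 1)]
    return all(r1 <= r2 for r1, r2 in zip(ratios, ratios[1:]))
```

Calling `has_monotone_likelihood_ratio(10, 0.8, 0.5)` checks the property for a ten-item test and one pair of levels; a numerical check of this kind illustrates the result but does not replace the general proof.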

Assuming the indices reflect the proper ordering of the actions, it follows from Table
1 that for threshold loss the condition of monotone loss is satisfied if at each stage of sampling
$k$ ($1 \le k \le n$):

$$
\begin{aligned}
(l_{13} + ke) - (l_{23} + ke) &\ge (l_{12} + ke) - ke \ge ke - (l_{21} + ke), \\
(l_{23} + ke) - ke &\ge ke - (l_{32} + ke) \ge (l_{21} + ke) - (l_{31} + ke),
\end{aligned}
\tag{3}
$$

or, equivalently,

$$
\begin{aligned}
l_{13} - l_{23} &\ge l_{12} \ge -l_{21}, \\
l_{23} &\ge -l_{32} \ge l_{21} - l_{31}.
\end{aligned}
\tag{4}
$$



Since (1) implies that $l(a_1(x_1,\ldots,x_k),t) - l(a_2(x_1,\ldots,x_k),t) = (b_1 - b_2)t - b_1 t_{c1} + b_2 t_{c2}$ and
$l(a_2(x_1,\ldots,x_k),t) - l(a_3(x_1,\ldots,x_k),t) = (b_2 + b_3)(t - t_{c2})$, both differences are strictly increasing
in $t$ whenever $b_i > 0$ and $(b_1 - b_2) > 0$, so each changes sign at most once. It follows immediately
that the condition of monotone loss is also satisfied for linear loss at each stage of sampling $k$
($1 \le k \le n$).
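
As an illustration, the monotone loss condition for threshold loss in (3) can be checked for a given set of loss parameters once the common term $ke$ is cancelled. The sketch below reads the inequalities as non-strict, which is an assumption about the original notation; the function name and dictionary layout are also hypothetical.

```python
def satisfies_monotone_threshold_loss(l):
    """Check condition (3) after cancelling ke on both sides.

    l: dict mapping (i, j), i != j, to the threshold loss l_ij.
    The inequalities are read as non-strict (an assumption).
    """
    # Differences between the losses of actions 1 and 2, ordered by true level.
    first = l[(1, 3)] - l[(2, 3)] >= l[(1, 2)] >= -l[(2, 1)]
    # Differences between the losses of actions 2 and 3, ordered by true level.
    second = l[(2, 3)] >= -l[(3, 2)] >= l[(2, 1)] - l[(3, 1)]
    return first and second
```

For instance, a hypothetical set of losses in which misclassifying a true master as a nonmaster costs no more than misclassifying him or her as a partial master would fail the first chain of inequalities.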



Optimal Rules for the Sequential Four-Action Mastery Problem

In this section, optimal cutting scores will be derived for the sequential four-action mastery
problem. In doing so, first the posterior expected loss for the fixed-length three-action mastery
problem will be minimized, given $X_1 = x_1, \ldots, X_k = x_k$ ($1 \le k \le n$). In other words, for the
fixed-length three-action mastery problem it will be determined which of the three actions
$a_1(x_1,\ldots,x_k)$, $a_2(x_1,\ldots,x_k)$, or $a_3(x_1,\ldots,x_k)$ yields the smallest posterior expected loss, given an
observed item response vector $(x_1,\ldots,x_k)$. Next, optimal rules for the sequential four-action
mastery problem are computed at each stage of sampling $k$ ($1 \le k < n$) by comparing this
smallest posterior expected loss with the posterior expected loss associated with action $a_4(x_1,\ldots,x_k)$
(i.e., continuing sampling).

Minimal Posterior Expected Loss for the Fixed-Length Mastery Problem

In minimizing the posterior expected loss for the fixed-length three-action mastery problem,
first the situation with linear loss will be elaborated. Next, the case of threshold loss will be
examined. It will be assumed that the empirical data fits a beta distribution, which is used to
represent prior knowledge about $T$.

Appropriate Mastery Classification Decision with Linear Loss

It can easily be verified from (1) that the decision rule minimizing the posterior expected loss
in the case of linear loss, given $X_1 = x_1, \ldots, X_k = x_k$ ($1 \le k \le n$), is to declare mastery,
$a_3(x_1,\ldots,x_k)$, when a student's number-correct score $s_k$ ($0 \le s_k \le k$) is such that

$$
E[b_3 (t_{c2} - T) + ke \mid s_k] \le E[b_2 (T - t_{c2}) + ke \mid s_k]. \tag{5}
$$

Since $(b_2 + b_3) > 0$, this is equivalent to declaring mastery if

$$
E(T \mid s_k) \ge t_{c2}, \tag{6}
$$

where $E(T \mid s_k)$ denotes the posterior expectation of $T$, given the observed number-correct
score $s_k$.

If the inequality in (6) does not hold, a decision rule minimizing the posterior expected
loss, given $X_1 = x_1, \ldots, X_k = x_k$ ($1 \le k \le n$), is to declare partial mastery, $a_2(x_1,\ldots,x_k)$, if it holds
for number-correct score $s_k$ that

$$
E[b_2 (T - t_{c2}) + ke \mid s_k] \le E[b_1 (T - t_{c1}) + ke \mid s_k], \tag{7}
$$

and to declare nonmastery ($a_1(x_1,\ldots,x_k)$) to him/her otherwise. Since $(b_1 - b_2) > 0$, it follows that
partial mastery is declared if

$$
E(T \mid s_k) \ge (b_1 t_{c1} - b_2 t_{c2})/(b_1 - b_2), \tag{8}
$$

and nonmastery is declared otherwise.

Putting $l(a_2(x_1,\ldots,x_k),t)$ and $l(a_3(x_1,\ldots,x_k),t)$ equal to each other, it appears that the
$t$-coordinate of the intersection of both loss lines, $t_{c2}$, is equal to the right-hand side of (6).
Similarly, the $t$-coordinate of the intersection of $l(a_1(x_1,\ldots,x_k),t)$ and $l(a_2(x_1,\ldots,x_k),t)$, say $t_{12}$, is
equal to the right-hand side of (8).

Hence, with linear loss, the decision procedure for the fixed-length three-action
mastery problem, given $X_1 = x_1, \ldots, X_k = x_k$ ($1 \le k \le n$), can now be stated as follows: Mastery
is declared to a student ($a_3(x_1,\ldots,x_k)$) if his/her posterior expectation of $T$ is equal to or larger
than $t_{c2}$. If his/her posterior expectation of $T$ is smaller than $t_{c2}$, however, the following two
situations can be distinguished: First, his/her posterior expectation of $T$ is smaller than $t_{c2}$ but
equal to or larger than $t_{12}$. In this case, partial mastery is declared ($a_2(x_1,\ldots,x_k)$). Second,
his/her posterior expectation of $T$ is not only smaller than $t_{c2}$ but also smaller than $t_{12}$. In this case,
nonmastery is declared ($a_1(x_1,\ldots,x_k)$).

In the present paper, prior knowledge about $T$ will be estimated by using empirical
data from other students of the group to which the individual student belongs (i.e., empirical
Bayes approach). Here, it will be assumed that the empirical data fits a beta distribution,
$B(\alpha, \beta)$. Its flexible form nearly always makes an approximation of prior beliefs possible
(Novick & Jackson, 1974, pp. 107-113). In the empirical example, it will be examined if this
assumption holds against the data.

Keats and Lord (1962) have shown that simple moment estimators of $\alpha$ and $\beta$, based
upon the mean $\mu$ and the KR-21 reliability $\rho$ of the observed number-correct score from other
students of the group to which the student belongs, are given as

$$
\alpha = (-1 + 1/\rho)\,\mu, \qquad \beta = -\alpha + m/\rho - m, \tag{9}
$$

where $m$ denotes the number of items in the test from which $\rho$ and $\mu$ are computed.

It follows from an application of Bayes' theorem that under the assumed binomial
model from (2), the posterior distribution of $T$ will again be a member of the beta family (the
conjugacy property, see e.g., Lehmann, 1959). In fact, if the prior distribution is $B(\alpha, \beta)$ and
student's observed number-correct score is $s_k$ from a test of length $k$ ($1 \le k \le n$), then the
posterior distribution is $B(\alpha + s_k, \beta + k - s_k)$.

Using the fact that the expectation of a beta distribution $B(\alpha, \beta)$ is equal to $\alpha/(\alpha + \beta)$, it
follows that the posterior expectation of $T$ can be written as $(\alpha + s_k)/(\alpha + s_k + \beta + k - s_k)$, or,
equivalently,

$$
E(T \mid s_k) = (\alpha + s_k)/(\alpha + \beta + k). \tag{10}
$$

Hence, with linear loss and a beta distribution representing prior knowledge about $T$,
the optimal number of items for the fixed-length three-action mastery problem, given
$X_1 = x_1, \ldots, X_k = x_k$ ($1 \le k \le n$), can be computed by comparing the right-hand side of (10) with the
right-hand sides of (6) and (8).
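
The comparison just described can be sketched in Python. The code below is a minimal illustration of the Keats and Lord estimators in (9), the posterior expectation in (10), and the linear-loss rules (6) and (8); all function names and the numerical values in any call are assumptions made for the example, not results from the paper.

```python
def keats_lord_beta_prior(mean, kr21, m):
    """Moment estimators (9) of the beta prior from the group mean and KR-21."""
    alpha = (-1.0 + 1.0 / kr21) * mean
    beta = -alpha + m / kr21 - m
    return alpha, beta


def posterior_mean(alpha, beta, s_k, k):
    """Posterior expectation (10) of T after s_k correct out of k items."""
    return (alpha + s_k) / (alpha + beta + k)


def classify_linear_loss(alpha, beta, s_k, k, t_c1, t_c2, b1, b2):
    """Return 3 (mastery), 2 (partial mastery) or 1 (nonmastery).

    Applies rule (6) first and rule (8) second; requires b1 > b2 > 0.
    """
    t_12 = (b1 * t_c1 - b2 * t_c2) / (b1 - b2)   # right-hand side of (8)
    expected_t = posterior_mean(alpha, beta, s_k, k)
    if expected_t >= t_c2:
        return 3
    if expected_t >= t_12:
        return 2
    return 1
```

Note that the slope $b_3$ does not appear in the classification step: by (6) the mastery decision depends only on whether the posterior expectation reaches $t_{c2}$.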

As an aside, it may be noted that if no information is available from the group to
which the individual student belongs, the parameters of the beta prior can be specified as
$\alpha = \beta = 1$. In that case, the prior distribution represents a uniform distribution on the standard
interval from zero to one; hence, prior true level of functioning can take on all values between
0 and 1 with equal probability.

It is important to notice that if no empirical data is available for estimating prior true 
level of functioning, we are no longer dealing with an empirical Bayesian approach. Prior 
knowledge about T is estimated in this case by subjective assessment (e.g., Lewis & Sheehan, 
1990). 

Appropriate Mastery Classification Decision with Threshold Loss 

In the case of threshold loss, it can be seen from Table 1 that a decision rule minimizing the
posterior expected loss, given $X_1 = x_1, \ldots, X_k = x_k$ ($1 \le k \le n$), is to declare mastery
($a_3(x_1,\ldots,x_k)$) when a student's number-correct score $s_k$ ($0 \le s_k \le k$) is such that

$$
\begin{aligned}
& l_{31} P(T \le t_{c1} \mid s_k) + l_{32} P(t_{c1} < T < t_{c2} \mid s_k) + ke \le \\
& l_{21} P(T \le t_{c1} \mid s_k) + l_{23} P(T \ge t_{c2} \mid s_k) + ke.
\end{aligned}
\tag{11}
$$

Rearranging terms, it can easily be verified from (11) that mastery is declared if

$$
(l_{31} - l_{21} - l_{32}) P(T \ge t_{c1} \mid s_k) + (l_{23} + l_{32}) P(T \ge t_{c2} \mid s_k) - l_{31} + l_{21} \ge 0. \tag{12}
$$

If the inequality in (12) does not hold, a decision rule minimizing the posterior
expected loss, given $X_1 = x_1, \ldots, X_k = x_k$, is to declare partial mastery, $a_2(x_1,\ldots,x_k)$, when a
student's number-correct score $s_k$ ($0 \le s_k \le k$) is such that

$$
\begin{aligned}
& l_{21} P(T \le t_{c1} \mid s_k) + l_{23} P(T \ge t_{c2} \mid s_k) + ke \le \\
& l_{12} P(t_{c1} < T < t_{c2} \mid s_k) + l_{13} P(T \ge t_{c2} \mid s_k) + ke,
\end{aligned}
\tag{13}
$$

and to declare nonmastery ($a_1(x_1,\ldots,x_k)$) otherwise. It follows that partial mastery is declared if

$$
(l_{12} + l_{21}) P(T \ge t_{c1} \mid s_k) + (l_{13} - l_{23} - l_{12}) P(T \ge t_{c2} \mid s_k) - l_{21} \ge 0, \tag{14}
$$

and nonmastery is declared otherwise.
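
The threshold-loss rules in (12) and (14) can be sketched in Python as follows. The posterior tail probabilities are obtained here by a simple midpoint rule applied to the beta density, which is a numerical shortcut chosen for this example and not the tabulation referred to in the paper; the function names, the dictionary layout for $l_{ij}$, and the number of integration steps are likewise assumptions.

```python
from math import exp, lgamma, log


def beta_tail(a, b, t, steps=2000):
    """Approximate P(T >= t) for T ~ Beta(a, b) with the midpoint rule."""
    if t <= 0.0:
        return 1.0
    if t >= 1.0:
        return 0.0
    log_norm = lgamma(a + b) - lgamma(a) - lgamma(b)
    width = (1.0 - t) / steps
    total = 0.0
    for i in range(steps):
        x = t + (i + 0.5) * width          # midpoint, strictly inside (t, 1)
        total += exp(log_norm + (a - 1) * log(x) + (b - 1) * log(1.0 - x))
    return total * width


def classify_threshold_loss(alpha, beta, s_k, k, t_c1, t_c2, l):
    """Return 3, 2 or 1 by applying rule (12) and then rule (14)."""
    a, b = alpha + s_k, beta + k - s_k     # posterior beta parameters
    p1 = beta_tail(a, b, t_c1)             # P(T >= t_c1 | s_k)
    p2 = beta_tail(a, b, t_c2)             # P(T >= t_c2 | s_k)
    mastery = ((l[(3, 1)] - l[(2, 1)] - l[(3, 2)]) * p1
               + (l[(2, 3)] + l[(3, 2)]) * p2 - l[(3, 1)] + l[(2, 1)])
    if mastery >= 0:
        return 3
    partial = ((l[(1, 2)] + l[(2, 1)]) * p1
               + (l[(1, 3)] - l[(2, 3)] - l[(1, 2)]) * p2 - l[(2, 1)])
    return 2 if partial >= 0 else 1
```

The item cost $ke$ does not enter the classification step because it is common to both sides of (11) and (13).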

The cumulative posterior distributions $P(T \ge t_{c1} \mid s_k)$ and $P(T \ge t_{c2} \mid s_k)$ in (12) and
(14) for the beta distribution have been extensively tabulated (e.g., Pearson, 1930). Normal
approximations are also available (Johnson & Kotz, 1970, sect. 2.4.6). In general, if $T$ has a
beta distribution with parameters $(\alpha, \beta)$ where neither $\alpha$ nor $\beta$ is small (say, not less than 10), then this
distribution can be approximated by a normal distribution with mean $\alpha/(\alpha + \beta)$ and variance
$\alpha\beta/[(\alpha + \beta)^2 (\alpha + \beta + 1)]$.
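
The normal approximation mentioned above can be sketched with the Python standard library. The function name is an assumption for this example, and the approximation should only be trusted when neither parameter is small, as stated in the text.

```python
from statistics import NormalDist


def beta_tail_normal(alpha, beta, t):
    """Normal approximation to P(T >= t) for T ~ Beta(alpha, beta).

    Uses the beta mean and variance; intended for alpha and beta
    that are both not small.
    """
    mean = alpha / (alpha + beta)
    variance = alpha * beta / ((alpha + beta) ** 2 * (alpha + beta + 1))
    return 1.0 - NormalDist(mean, variance ** 0.5).cdf(t)
```

Applied to a posterior, the arguments would be $\alpha + s_k$ and $\beta + k - s_k$ in place of the prior parameters.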

Minimizing Posterior Expected Loss for the Sequential Mastery Problem 

Since the action $a_4(x_1,\ldots,x_n)$ (i.e., continuing sampling) is not available at the final stage of
sampling, $n$, the action $a_1(x_1,\ldots,x_n)$, $a_2(x_1,\ldots,x_n)$, or $a_3(x_1,\ldots,x_n)$ with the smallest posterior
expected loss also represents the optimal sequential rule at the final stage of sampling.
Optimal sequential rules at the other stages of sampling $k$ (i.e., $1 \le k < n$) are computed by
comparing the smallest posterior expected loss of the three actions $a_1(x_1,\ldots,x_k)$, $a_2(x_1,\ldots,x_k)$,
and $a_3(x_1,\ldots,x_k)$ with the posterior expected loss of action $a_4(x_1,\ldots,x_k)$. As noted before, the
posterior expected loss associated with continuing sampling is determined by considering all
possible future decision outcomes (i.e., backward induction). Hence, the following backward
induction computational scheme can be used for determining the optimal sequential rules for
our four-action mastery problem:

Suppose that $X_1 = x_1, \ldots, X_n = x_n$ has been observed at the final stage of sampling, $n$.
Then, it is first computed which of the three actions $a_1(x_1,\ldots,x_n)$, $a_2(x_1,\ldots,x_n)$, and $a_3(x_1,\ldots,x_n)$
yields the smallest posterior expected loss at the final stage of sampling. Let this optimal
action be denoted as $\varphi_n(x_1,\ldots,x_n)$ and its associated minimum posterior expected loss as
$V_n(x_1,\ldots,x_n)$.


Generally, the action $a_1(x_1,\ldots,x_k)$, $a_2(x_1,\ldots,x_k)$, or $a_3(x_1,\ldots,x_k)$ yielding the smallest
posterior expected loss, given $X_1 = x_1, \ldots, X_k = x_k$ ($1 \le k \le n$), is denoted as $\varphi_k(x_1,\ldots,x_k)$ and its
associated minimum posterior expected loss as $V_k(x_1,\ldots,x_k)$. If no observation has been taken
yet, $\varphi_0(x_0)$ and $V_0(x_0)$ denote the action $a_1(x_0)$, $a_2(x_0)$, or $a_3(x_0)$ which yields the smallest prior
expected loss and its associated minimum prior expected loss, respectively.




Next, $\varphi_{n-1}(x_1,\ldots,x_{n-1})$ and $V_{n-1}(x_1,\ldots,x_{n-1})$ are computed at stage $(n-1)$ of sampling. At
this stage of sampling, however, we must also take into account the possible action of
continuing sampling, $a_4(x_1,\ldots,x_{n-1})$. Hence, $V_{n-1}(x_1,\ldots,x_{n-1})$ at stage $(n-1)$ must be compared with
the posterior expected loss associated with continuing sampling. At stage $(n-1)$ of sampling,
the posterior expected loss associated with taking one more observation,
$E[V_n(x_1,\ldots,x_{n-1},X_n) \mid X_1 = x_1,\ldots,X_{n-1} = x_{n-1}]$, is computed as follows:



$$E[V_n(x_1,\ldots,x_{n-1},X_n) \mid X_1 = x_1,\ldots,X_{n-1} = x_{n-1}] = \sum_{x_n=0}^{1} V_n(x_1,\ldots,x_n)\, P(X_n = x_n \mid X_1 = x_1,\ldots,X_{n-1} = x_{n-1}), \tag{15}$$



where $P(X_n \mid X_1 = x_1,\ldots,X_{n-1} = x_{n-1})$ denotes the conditional distribution of $X_n$, given the
observed item response vector $(x_1,\ldots,x_{n-1})$. This is also called the posterior predictive
distribution of $X_n$ at stage $(n-1)$ of sampling. In the next section it will be indicated how,
generally, the posterior predictive distribution of $X_k$ ($1 \le k \le n$), given the observed item
response vector $(x_1,\ldots,x_{k-1})$, can be computed. Note that (15) averages the posterior expected
loss associated with each of the possible future decision outcomes relative to the probability of
observing those outcomes (Lewis & Sheehan, 1990).

Following Lewis and Sheehan (1990), the minimum conditional Bayes risk at stage $(n-1)$
of sampling, given $X_1 = x_1,\ldots,X_{n-1} = x_{n-1}$, is defined as:

$$R_{n-1}(x_1,\ldots,x_{n-1}) = \min\{V_{n-1}(x_1,\ldots,x_{n-1}),\; E[V_n(x_1,\ldots,x_{n-1},X_n) \mid X_1 = x_1,\ldots,X_{n-1} = x_{n-1}]\}. \tag{16}$$



Let the optimal rule for the sequential four-action mastery problem at stage $(k-1)$ ($1 \le k \le n$),
given $X_1 = x_1,\ldots,X_{k-1} = x_{k-1}$, be defined as $d_{k-1}(x_1,\ldots,x_{k-1})$, where $d_0(x_0)$ denotes the decision
whether or not to take at least one observation. Then, $d_{n-1}(x_1,\ldots,x_{n-1})$ can be obtained by
comparing $V_{n-1}(x_1,\ldots,x_{n-1})$ and $E[V_n(x_1,\ldots,x_{n-1},X_n) \mid X_1 = x_1,\ldots,X_{n-1} = x_{n-1}]$ with each other.
Hence, it follows:



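$$d_{n-1}(x_1,\ldots,x_{n-1}) = \begin{cases} \varphi_{n-1}(x_1,\ldots,x_{n-1}) & \text{if } R_{n-1}(x_1,\ldots,x_{n-1}) = V_{n-1}(x_1,\ldots,x_{n-1}) \\ \text{continue sampling} & \text{if } R_{n-1}(x_1,\ldots,x_{n-1}) = E[V_n(x_1,\ldots,x_{n-1},X_n) \mid X_1 = x_1,\ldots,X_{n-1} = x_{n-1}]. \end{cases} \tag{17}$$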
In the case of equality between $V_{n-1}(x_1,\ldots,x_{n-1})$ and $E[V_n(x_1,\ldots,x_{n-1},X_n) \mid X_1 = x_1,\ldots,X_{n-1} = x_{n-1}]$,
it does not matter whether or not the decision-maker takes one more observation.

Let the minimum conditional Bayes risk at stage $n$ of sampling, given $X_1 = x_1,\ldots,X_n = x_n$,
be defined as $V_n(x_1,\ldots,x_n)$. Then, generally, the minimum conditional Bayes risk at stage
$(k-1)$, given $X_1 = x_1,\ldots,X_{k-1} = x_{k-1}$, is computed inductively as a function of the minimum
conditional Bayes risk at stage $k$ ($1 \le k \le n$) as follows:

$$R_{k-1}(x_1,\ldots,x_{k-1}) = \min\{V_{k-1}(x_1,\ldots,x_{k-1}),\; E[R_k(x_1,\ldots,x_{k-1},X_k) \mid X_1 = x_1,\ldots,X_{k-1} = x_{k-1}]\}, \tag{18}$$

where the posterior expected loss associated with taking one more observation at stage $(k-1)$
of sampling, $E[R_k(x_1,\ldots,x_{k-1},X_k) \mid X_1 = x_1,\ldots,X_{k-1} = x_{k-1}]$, is computed as follows ($1 \le k \le n$):

$$E[R_k(x_1,\ldots,x_{k-1},X_k) \mid X_1 = x_1,\ldots,X_{k-1} = x_{k-1}] = \sum_{x_k=0}^{1} R_k(x_1,\ldots,x_k)\, P(X_k = x_k \mid X_1 = x_1,\ldots,X_{k-1} = x_{k-1}). \tag{19}$$

Analogous to the computation at stage $(n-1)$, $\varphi_{n-2}(x_1,\ldots,x_{n-2})$ and $V_{n-2}(x_1,\ldots,x_{n-2})$ are
now computed at stage $(n-2)$ of sampling. Next, using (18)-(19),
$E[R_{n-1}(x_1,\ldots,x_{n-2},X_{n-1}) \mid X_1 = x_1,\ldots,X_{n-2} = x_{n-2}]$ and $R_{n-2}(x_1,\ldots,x_{n-2})$ are computed at stage $(n-2)$. Finally, analogous to the
computation of $d_{n-1}(x_1,\ldots,x_{n-1})$, $d_{n-2}(x_1,\ldots,x_{n-2})$ is computed at stage $(n-2)$ by comparing
$V_{n-2}(x_1,\ldots,x_{n-2})$ and $E[R_{n-1}(x_1,\ldots,x_{n-2},X_{n-1}) \mid X_1 = x_1,\ldots,X_{n-2} = x_{n-2}]$ with each other. Following
the same computational backward scheme, $d_{n-3}(x_1,\ldots,x_{n-3}),\ldots,d_0(x_0)$ are computed.
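The backward scheme in (16)-(19) can be illustrated with a short Python sketch. It is not the author's LINEAR or THRESHOLD program. It assumes that, under the binomial model with a beta prior, the number-correct score $s_k$ summarizes the response vector, so that a state is the pair $(k, s_k)$. The names `backward_induction`, `predictive_correct`, and `toy_stop_loss` are hypothetical, and the toy loss is invented purely to exercise the recursion.

```python
def predictive_correct(k, s_prev, alpha, beta):
    """P(X_k = 1 | s_{k-1}) under the beta-binomial model."""
    return (alpha + s_prev) / (alpha + beta + k - 1)


def backward_induction(n, alpha, beta, stop_loss):
    """Return tables R[k][s] (minimum Bayes risk) and d[k][s] (optimal rule).

    stop_loss(k, s) must return (action, V): the best of the three
    classification actions at state (k, s) and its expected loss V_k.
    """
    R = [[0.0] * (k + 1) for k in range(n + 1)]
    d = [[None] * (k + 1) for k in range(n + 1)]
    # Final stage: continuing is not available, so R_n = V_n.
    for s in range(n + 1):
        d[n][s], R[n][s] = stop_loss(n, s)
    # Earlier stages: compare stopping with taking one more observation.
    for k in range(n - 1, -1, -1):
        for s in range(k + 1):
            action, v_stop = stop_loss(k, s)
            p1 = predictive_correct(k + 1, s, alpha, beta)
            v_cont = (1 - p1) * R[k + 1][s] + p1 * R[k + 1][s + 1]
            # Ties are resolved in favour of stopping (an arbitrary choice).
            if v_stop <= v_cont:
                d[k][s], R[k][s] = action, v_stop
            else:
                d[k][s], R[k][s] = "continue", v_cont
    return R, d


def toy_stop_loss(k, s, alpha=3.75, beta=3.28, e=0.01):
    """Hypothetical stopping loss: scaled posterior variance plus item cost."""
    a, b = alpha + s, beta + k - s
    post_var = a * b / ((a + b) ** 2 * (a + b + 1))
    return "stop", 10 * post_var + k * e


R, d = backward_induction(10, 3.75, 3.28, toy_stop_loss)
```

To apply the sketch to a real loss structure, `stop_loss` would be replaced by a function that evaluates the three classification actions under linear or threshold loss, including the cost of the items already administered.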

Computation of Posterior Predictive Distribution 

In this section, it will be indicated how the posterior predictive distribution
$P(X_k \mid X_1 = x_1,\ldots,X_{k-1} = x_{k-1})$ in (19) can be computed ($1 \le k \le n$). From Bayes' theorem, it follows that:

$$P(X_k = x_k \mid X_1 = x_1,\ldots,X_{k-1} = x_{k-1}) = \frac{P(X_1 = x_1,\ldots,X_k = x_k)}{P(X_1 = x_1,\ldots,X_{k-1} = x_{k-1})}. \tag{20}$$

Since the binomial model was adopted for the psychometric model involved, it follows from
(2) that

$$P(X_1 = x_1,\ldots,X_k = x_k \mid t) = t^{s_k}(1-t)^{k-s_k}. \tag{21}$$

Furthermore, the p.d.f. of $T$ was assumed to be distributed according to a beta distribution
$B(\alpha,\beta)$ with parameters $\alpha$ and $\beta$ ($\alpha, \beta > 0$) in the standard interval [0,1]:

$$p(t) = \frac{\Gamma(\alpha+\beta)\, t^{\alpha-1}(1-t)^{\beta-1}}{\Gamma(\alpha)\Gamma(\beta)}, \tag{22}$$

where $\Gamma$ is the usual gamma function.

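Multiplying (21) and (22) and integrating out $t$ yields the unconditional distribution of
$(X_1,\ldots,X_k)$:

$$P(X_1 = x_1,\ldots,X_k = x_k) = \frac{\Gamma(\alpha+\beta)\,\Gamma(\alpha+s_k)\,\Gamma(\beta+k-s_k)}{\Gamma(\alpha)\,\Gamma(\beta)\,\Gamma(\alpha+\beta+k)}. \tag{23}$$

Similarly, the unconditional distribution of $(X_1,\ldots,X_{k-1})$ is equal to:

$$P(X_1 = x_1,\ldots,X_{k-1} = x_{k-1}) = \frac{\Gamma(\alpha+\beta)\,\Gamma(\alpha+s_{k-1})\,\Gamma(\beta+k-1-s_{k-1})}{\Gamma(\alpha)\,\Gamma(\beta)\,\Gamma(\alpha+\beta+k-1)}. \tag{24}$$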

It now follows from (20), (23), and (24) that the posterior predictive distribution of $X_k$, given
the observed item response vector $(x_1,\ldots,x_{k-1})$, can be written as:

$$P(X_k = x_k \mid X_1 = x_1,\ldots,X_{k-1} = x_{k-1}) = \frac{\Gamma(\alpha+s_k)\,\Gamma(\beta+k-s_k)\,\Gamma(\alpha+\beta+k-1)}{\Gamma(\alpha+s_{k-1})\,\Gamma(\beta+k-1-s_{k-1})\,\Gamma(\alpha+\beta+k)}. \tag{25}$$

Since $s_k = s_{k-1}$ and $s_k = s_{k-1}+1$ for $x_k = 0$ and 1, respectively, and using the well-known
identity $\Gamma(j+1) = j\Gamma(j)$, it finally follows from (25) that:

$$P(X_k = x_k \mid X_1 = x_1,\ldots,X_{k-1} = x_{k-1}) = \begin{cases} (\beta - s_{k-1} + k - 1)/(\alpha + \beta + k - 1) & \text{if } x_k = 0 \\ (\alpha + s_{k-1})/(\alpha + \beta + k - 1) & \text{if } x_k = 1. \end{cases} \tag{26}$$
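As a numerical check on the step from (25) to (26), the gamma-function ratio and the closed form can be compared directly. The Python sketch below uses hypothetical function names; the log-gamma function is used only to avoid overflow.

```python
import math


def predictive_gamma_form(x_k, k, s_prev, alpha, beta):
    """P(X_k = x_k | s_{k-1}) from the gamma-function ratio in (25)."""
    s_k = s_prev + x_k
    log_p = (math.lgamma(alpha + s_k) + math.lgamma(beta + k - s_k)
             + math.lgamma(alpha + beta + k - 1)
             - math.lgamma(alpha + s_prev)
             - math.lgamma(beta + k - 1 - s_prev)
             - math.lgamma(alpha + beta + k))
    return math.exp(log_p)


def predictive_closed_form(x_k, k, s_prev, alpha, beta):
    """P(X_k = x_k | s_{k-1}) from the closed form in (26)."""
    denom = alpha + beta + k - 1
    if x_k == 1:
        return (alpha + s_prev) / denom
    return (beta + k - 1 - s_prev) / denom
```

For any admissible state the two forms should agree up to rounding, and the probabilities for $x_k = 0$ and $x_k = 1$ should sum to one.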




An Empirical Example

The procedures for computing the optimal sequential number of items were applied to a 
computerized four-action mastery test for concept-learning in medicine for freshmen. 
Concept-learning is the process in which subjects learn to categorize objects, processes or 
events, for instance, formation of diagnostic skills in medicine or psychology (see Tennyson 
and Cocchiarella, 1986, for a complete review of the theory of concept-learning). 

Information from the group to which the student belongs was available in the form of 
data from a pretest for a sample of 76 freshmen in a medical program. The pretest consisted of 
30 multiple-choice items and had possible test scores ranging from 0-30. The mean and KR- 
21 reliability coefficient were estimated as 16 and 0.81, respectively. Hence, it follows from 
(9) that $\alpha$ and $\beta$ were estimated as 3.75 and 3.28, respectively.
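Equation (9) is not reproduced in this section. The sketch below therefore assumes the usual moment estimator for the beta-binomial model based on the mean and KR-21 reliability, namely $\alpha = (-1 + 1/\rho_{21})\mu$ and $\beta = -\alpha + n/\rho_{21} - n$; this assumption reproduces the reported estimates. The function name is hypothetical.

```python
def beta_from_moments(mean, kr21, n_items):
    """Moment estimates of (alpha, beta) from the test mean and KR-21.

    Assumed form of the estimator; equation (9) itself is not shown here.
    """
    alpha = (1.0 / kr21 - 1.0) * mean
    beta = n_items / kr21 - n_items - alpha
    return alpha, beta


# Pretest values reported in the text: mean 16, KR-21 of 0.81, 30 items.
alpha_hat, beta_hat = beta_from_moments(16, 0.81, 30)
```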

The fit of the pretest data to the binomial model with the assumed beta distribution for 
prior true level of functioning was checked by comparing the theoretical score distribution 
with the empirical observed score distribution. Keats and Lord (1962) have shown that the 
theoretical score distribution is the negative hypergeometric distribution. The results of the 
chi-square test showed a satisfactory fit at a significance level of 0.05. 

The instructors of the program considered students as having mastered the present
concept successfully if they had mastered at least 60% of the total number of items covering
the subject matter of that concept (i.e., true mastery). Therefore, $t_{c2}$ was fixed at 0.6.
Furthermore, nonmastery was declared if students had mastered less than 50% of the total
number of items covering the subject matter of the present concept (i.e., true nonmastery).
Therefore, $t_{c1}$ was fixed at 0.5.

Finally, the constant cost for administering one random item was assumed to be rather 
small. Therefore, the value of e was set equal to 0.01. 

Results with Linear Loss and a Beta Prior for T 

First, the case of linear loss and a beta distribution for prior knowledge about $T$ is considered.
Taking into account the requirements discussed earlier, the loss parameters were empirically
assessed by the instructors of the program, yielding the following result: $b_1 = 8$, $b_2 = 3$, and
$b_3 = 1$. For these values of the loss parameters, the right-hand side of (8) turned out to be equal to
0.44.
The appropriate action (i.e., nonmastery, partial mastery, mastery, or continue
sampling) is depicted in Table 2 as a closed interval for a maximum of 30 items (i.e., $n = 30$)
at each stage of sampling $k$ ($0 \le k \le n$) for different number-correct scores $s_k$ ($0 \le s_k \le k$).

Table 2 has been constructed by applying the following backward induction
computational scheme. First, the appropriate action and its associated minimum posterior
expected loss at the final stage of sampling have been determined; that is, $\varphi_{30}(x_1,\ldots,x_{30})$ and
$V_{30}(x_1,\ldots,x_{30})$ have been computed for $s_{30} = 0,\ldots,30$. More specifically, nonmastery was
declared for those values of $s_{30}$ for which $E(T \mid s_{30}) < 0.44$, partial mastery was declared for
those values of $s_{30}$ for which $0.44 < E(T \mid s_{30}) < 0.6$, and mastery was declared for those values
of $s_{30}$ for which $E(T \mid s_{30}) > 0.6$. Note that it can be inferred from Table 2 that the cutting
scores $s_{c1}(30)$ and $s_{c2}(30)$ are equal to 12 and 19, respectively.
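The final-stage classification can be checked with a few lines of Python, using the fact that the posterior mean of $T$ under a beta prior and binomial model is $(\alpha + s_n)/(\alpha + \beta + n)$. The function name is hypothetical, and the handling of exact ties at 0.44 and 0.6 is an assumption, since the text does not specify it (no tie occurs for these parameter values).

```python
def classify_final_stage(s, n=30, alpha=3.75, beta=3.28, lower=0.44, upper=0.60):
    """Classify a number-correct score s at the final stage by posterior mean."""
    post_mean = (alpha + s) / (alpha + beta + n)
    if post_mean < lower:
        return "nonmastery"
    if post_mean < upper:  # ties assigned upward: an assumption
        return "partial mastery"
    return "mastery"


actions = [classify_final_stage(s) for s in range(31)]
largest_nonmastery = max(s for s, a in enumerate(actions) if a == "nonmastery")
smallest_mastery = min(s for s, a in enumerate(actions) if a == "mastery")
```

With the estimates reported above, `largest_nonmastery` is 12 and `smallest_mastery` is 19, which agrees with the final row of Table 2.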

Similarly, the appropriate action nonmastery, partial mastery, or mastery and its
associated minimum posterior expected loss have been computed after 29 items for $s_{29} = 0,\ldots,29$
(i.e., $\varphi_{29}(x_1,\ldots,x_{29})$ and $V_{29}(x_1,\ldots,x_{29})$). Next, using (18), (19), (26), and the minimum
posterior expected losses calculated at the final stage of sampling, the posterior expected loss
associated with taking one more observation at stage 29 of sampling is computed for $s_{29} = 0,\ldots,29$
(i.e., $E[V_{30}(x_1,\ldots,x_{29},X_{30}) \mid X_1 = x_1,\ldots,X_{29} = x_{29}]$). These values are compared to the
minimum posterior expected losses after stopping after 29 items in order to compute the
conditional Bayes risk at stage 29 of sampling. Using (17), the appropriate action nonmastery,
partial mastery, mastery, or continue sampling is determined at stage 29. Similarly, the
appropriate action is determined at stage 28 until stage 0 of sampling. A computer program
called LINEAR was developed to determine the appropriate action at each stage of sampling.
A copy of the program LINEAR is available from the author upon request.

As can be seen from Table 2, regardless of the observed number-correct score $s_k$, the
decision-maker takes at least five observations. Furthermore, Table 2 shows that it is hard to
classify a student as a nonmaster, partial master, or master when his or her posterior
expectation of $T$ lies near the intersection of the loss lines $l(a_1(x_1,\ldots,x_k),t)$ and
$l(a_2(x_1,\ldots,x_k),t)$ or near the intersection of the loss lines $l(a_2(x_1,\ldots,x_k),t)$ and
$l(a_3(x_1,\ldots,x_k),t)$. Hence, longer tests are needed for such students. On the other hand,
shorter tests can be provided for students whose posterior expectation of $T$ is not near the
intersection of these loss lines.
Table 2. Appropriate action calculated by stage of sampling and number-correct in case of
linear loss. A dash indicates that no number-correct score leads to that action.

Stage  Nonmastery  Continue  Partial mastery  Continue  Mastery
0      -           0         -                -         -
1      -           [0,1]     -                -         -
2      -           [0,2]     -                -         -
3      -           [0,3]     -                -         -
4      -           [0,4]     -                -         -
5      -           [0,4]     -                -         5
6      0           [1,5]     -                -         6
7      0           [1,5]     -                -         [6,7]
8      0           [1,6]     -                -         [7,8]
9      [0,1]       [2,7]     -                -         [8,9]
10     [0,1]       [2,7]     -                -         [8,10]
11     [0,2]       [3,8]     -                -         [9,11]
12     [0,2]       [3,8]     -                -         [9,12]
13     [0,3]       [4,6]     7                [8,9]     [10,13]
14     [0,3]       [4,9]     -                -         [10,14]
15     [0,4]       [5,7]     8                [9,10]    [11,15]
16     [0,4]       [5,7]     [8,9]            [10,11]   [12,16]
17     [0,5]       [6,8]     9                [10,11]   [12,17]
18     [0,5]       [6,8]     [9,10]           [11,12]   [13,18]
19     [0,6]       [7,9]     10               [11,12]   [13,19]
20     [0,6]       [7,9]     [10,11]          [12,13]   [14,20]
21     [0,7]       [8,9]     [10,12]          13        [14,21]
22     [0,7]       [8,10]    [11,12]          [13,14]   [15,22]
23     [0,8]       [9,10]    [11,13]          [14,15]   [16,23]
24     [0,8]       [9,11]    [12,14]          15        [16,24]
25     [0,9]       [10,11]   [12,14]          [15,16]   [17,25]
26     [0,9]       [10,11]   [12,15]          16        [17,26]
27     [0,10]      [11,12]   [13,15]          [16,17]   [18,27]
28     [0,10]      [11,12]   [13,16]          17        [18,28]
29     [0,11]      12        [13,17]          18        [19,29]
30     [0,12]      -         [13,18]          -         [19,30]


Finally, it can be inferred from Table 2 that, as the number of items administered
increases, the chances of being classified as a nonmaster, partial master, or master
increase.

Let us assume that the sequential decision procedure starts with administering one
randomly selected item and stops after declaring nonmastery, partial mastery, or mastery.
Hence, the sequential decision procedure proceeds only after the continue sampling decision.
Then, it can easily be inferred from Table 2 that the optimal sequential rule can be depicted in
Table 3 at each stage of sampling $k$ ($1 \le k \le 30$) for different number-correct scores $s_k$
($0 \le s_k \le k$) as follows:



Table 3. Optimal sequential rule calculated by stage of sampling and number-correct in case
of linear loss. A dash indicates that no reachable number-correct score leads to that action.

Stage  Nonmastery  Continue  Partial mastery  Continue  Mastery
1      -           [0,1]     -                -         -
2      -           [0,2]     -                -         -
3      -           [0,3]     -                -         -
4      -           [0,4]     -                -         -
5      -           [0,4]     -                -         5
6      0           [1,5]     -                -         -
7      -           [1,5]     -                -         6
8      -           [1,6]     -                -         -
9      1           [2,7]     -                -         -
10     -           [2,7]     -                -         8
11     2           [3,8]     -                -         -
12     -           [3,8]     -                -         9
13     3           [4,6]     7                [8,9]     -
14     -           [4,9]     -                -         10
15     4           [5,7]     8                [9,10]    -
16     -           [5,7]     [8,9]            [10,11]   -
17     5           [6,8]     -                [10,11]   12
18     -           [6,8]     [9,10]           [11,12]   -
19     6           [7,9]     -                [11,12]   13
20     -           [7,9]     [10,11]          [12,13]   -
21     7           [8,9]     10 or 12         13        14
22     -           [8,10]    -                [13,14]   -
23     8           [9,10]    11 or 13         [14,15]   -
24     -           [9,11]    14               15        16
25     9           [10,11]   12               [15,16]   -
26     -           [10,11]   12 or 15         16        17
27     10          [11,12]   -                [16,17]   -
28     -           [11,12]   13 or 16         17        18
29     11          12        13 or 17         18        -
30     12          -         13 or 18         -         19


Note that not all possible number-correct scores $s_k$ are necessarily present at each stage
of sampling $k$, because it is assumed in Table 3 that the optimal sequential rule stops after
declaring nonmastery, partial mastery, or mastery. For instance, the number-correct score $s_6$
can only take the values 0 through 5, and thus not the value 6. This is because mastery was
declared for $s_5 = 5$, implying that the optimal sequential rule stops for this value of $s_5$.
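The way Table 3 follows from Table 2 can be illustrated by propagating the set of reachable number-correct scores forward through the continue regions. The Python sketch below uses the continue regions of Table 2 for stages 1 through 7 only; the function and variable names are hypothetical.

```python
def reachable_scores(continue_sets, first_stage=1):
    """Scores that can occur at each stage if testing stops at any
    classification decision.

    continue_sets[k] is the set of scores s_k at which the rule says
    'continue sampling' at stage k.
    """
    last = max(continue_sets)
    # After the first item every score 0..first_stage is possible.
    reachable = {first_stage: set(range(first_stage + 1))}
    for k in range(first_stage, last + 1):
        still_testing = reachable[k] & continue_sets[k]
        # One more item adds either 0 or 1 to the number-correct score.
        reachable[k + 1] = {s + x for s in still_testing for x in (0, 1)}
    return reachable


# Continue regions of Table 2 (linear loss) for stages 1 to 7.
table2_continue = {
    1: {0, 1},
    2: {0, 1, 2},
    3: {0, 1, 2, 3},
    4: {0, 1, 2, 3, 4},
    5: {0, 1, 2, 3, 4},
    6: {1, 2, 3, 4, 5},
    7: {1, 2, 3, 4, 5},
}
reachable = reachable_scores(table2_continue)
```

Under these inputs, `reachable[6]` contains the scores 0 through 5 only, in line with the remark above that $s_6 = 6$ cannot occur.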

Results with Threshold Loss and a Beta Prior for T 

Next, the case of threshold loss and a beta prior for $T$ is considered. Taking into account the
requirements discussed earlier and assuming equal losses for the correct decisions $l_{11}$, $l_{22}$, and
$l_{33}$, the losses from Table 1 were empirically assessed by the instructors of the program,
yielding the following result:



Table 4. Threshold loss table at stage k (1 ≤ k ≤ n) of sampling for the empirical example.

                          True level
Action              T < t_c1   t_c1 < T < t_c2   T > t_c2
a_1(x_1,...,x_k)    ke         4 + ke            7 + ke
a_2(x_1,...,x_k)    1 + ke     ke                2 + ke
a_3(x_1,...,x_k)    3 + ke     1 + ke            ke



Note that $l_{23}$ was assessed larger than $l_{21}$ for this specific empirical example. Using the
numerical values for the loss parameters $l_{ij}$ ($i,j = 1,2,3$) of Table 4, the appropriate action
nonmastery, partial mastery, mastery, or continue sampling is depicted in Table 5 for a
maximum of 30 items at each stage of sampling $k$ ($0 \le k \le n$) for different number-correct
scores $s_k$ ($0 \le s_k \le k$) as a closed interval again.

Table 5 was constructed by using the same backward induction computational scheme
as in the construction of Table 2. Doing so, the appropriate action at each stage of sampling $k$
($0 \le k \le n$) for the fixed-length three-action mastery problem (i.e., $\varphi_k(x_1,\ldots,x_k)$) was
determined by examining if the inequalities in (12) and (14) were satisfied. More specifically,
nonmastery was declared for those values of $s_k$ ($0 \le s_k \le k$) for which the left-hand sides of
(14) were equal to or smaller than zero, partial mastery was declared for those values of $s_k$ for
which the left-hand sides of (14) and (12) were larger and smaller than zero, respectively, and
mastery was declared for those values of $s_k$ for which the left-hand sides of (12) were equal to
or larger than zero. Using numerical procedures for calculating the cumulative posterior
distributions $P(T > t_{c1} \mid s_k)$ and $P(T > t_{c2} \mid s_k)$, a computer program called THRESHOLD was
developed to determine the appropriate action. A copy of the program THRESHOLD is
available from the author upon request.
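The stopping decision under threshold loss can be sketched as follows. This is not the THRESHOLD program: instead of evaluating the inequalities (12) and (14), the sketch directly minimizes the posterior expected loss over the three classification actions using the losses of Table 4, with the posterior probabilities of the three regions obtained by a simple midpoint rule. All names are hypothetical, and ties between actions are broken by dictionary order, which is an arbitrary choice.

```python
import math

# Losses of Table 4 without the item cost ke.
# Columns: T < t_c1, t_c1 < T < t_c2, T > t_c2.
LOSS = {
    "nonmastery": (0, 4, 7),
    "partial mastery": (1, 0, 2),
    "mastery": (3, 1, 0),
}


def region_probabilities(a, b, t_c1=0.5, t_c2=0.6, cells=20000):
    """Midpoint-rule integral of the Beta(a, b) density over the three regions."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    probs = [0.0, 0.0, 0.0]
    for i in range(cells):
        t = (i + 0.5) / cells
        dens = math.exp(log_norm + (a - 1) * math.log(t)
                        + (b - 1) * math.log(1 - t))
        region = 0 if t < t_c1 else (1 if t < t_c2 else 2)
        probs[region] += dens / cells
    return probs


def best_stopping_action(k, s, alpha=3.75, beta=3.28, e=0.01):
    """Classification action with the smallest posterior expected loss."""
    # The posterior of T given s correct out of k is Beta(alpha+s, beta+k-s).
    probs = region_probabilities(alpha + s, beta + k - s)
    losses = {act: sum(l * p for l, p in zip(row, probs)) + k * e
              for act, row in LOSS.items()}
    action = min(losses, key=losses.get)
    return action, losses[action]
```

A function of this kind could serve as the stopping-loss input to a backward induction routine; whether it reproduces Table 5 exactly depends on details, such as tie handling, that are not specified here.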



Table 5. Appropriate action calculated by stage of sampling and number-correct in case of
threshold loss. A dash indicates that no number-correct score leads to that action.

Stage  Nonmastery  Continue  Partial mastery  Continue  Mastery
0      -           0         -                -         -
1      -           [0,1]     -                -         -
2      -           [0,2]     -                -         -
3      -           [0,3]     -                -         -
4      -           [0,4]     -                -         -
5      -           [0,5]     -                -         -
6      -           [0,5]     -                -         6
7      -           [0,6]     -                -         7
8      -           [0,7]     -                -         8
9      0           [1,7]     -                -         [8,9]
10     0           [1,8]     -                -         [9,10]
11     0           [1,8]     -                -         [9,11]
12     [0,1]       [2,9]     -                -         [10,12]
13     [0,1]       [2,10]    -                -         [11,13]
14     [0,2]       [3,10]    -                -         [11,14]
15     [0,2]       [3,11]    -                -         [12,15]
16     [0,3]       [4,11]    -                -         [12,16]
17     [0,3]       [4,12]    -                -         [13,17]
18     [0,4]       [5,12]    -                -         [13,18]
19     [0,4]       [5,13]    -                -         [14,19]
20     [0,5]       [6,13]    -                -         [14,20]
21     [0,5]       [6,14]    -                -         [15,21]
22     [0,6]       [7,14]    -                -         [15,22]
23     [0,6]       [7,15]    -                -         [16,23]
24     [0,7]       [8,11]    12               [13,15]   [16,24]
25     [0,8]       [9,11]    [12,13]          [14,16]   [17,25]
26     [0,8]       [9,11]    [12,13]          [14,16]   [17,26]
27     [0,9]       [10,12]   [13,14]          [15,17]   [18,27]
28     [0,10]      [11,12]   [13,15]          [16,17]   [18,28]
29     [0,11]      12        [13,16]          17        [18,29]
30     [0,12]      -         [13,17]          -         [18,30]




As can be seen from Table 5, analogous to the situation with linear loss, the
decision-maker takes at least five observations. Furthermore, Table 5 shows that continue sampling
decisions in the region between the actions partial mastery and mastery are taken for the first
time after 23 items have been administered. Continue sampling decisions in the region
between the actions nonmastery and partial mastery, however, are taken already after 8 items
have been administered.

A possible explanation for this finding might be that the losses associated with taking
false nonmastery decisions are rather large relative to the losses associated with taking false
partial mastery decisions (i.e., 4 and 7 relative to 1 and 2), whereas the losses associated with
taking false partial mastery and mastery decisions (i.e., 1 and 2 relative to 3 and 1) do not
differ that much. Consequently, it seems better to continue sampling in the region between the
actions nonmastery and partial mastery in order to avoid relatively large losses associated with
taking false decisions.

Finally, analogous to the construction of Table 3 from Table 2, the optimal sequential
rule can be inferred from Table 5 at each stage of sampling $k$ ($1 \le k \le 30$) and for different
number-correct scores $s_k$ ($0 \le s_k \le k$) again. The result is depicted in Table 6.

Table 6. Optimal sequential rule calculated by stage of sampling and number-correct in case
of threshold loss. A dash indicates that no reachable number-correct score leads to that action.

Stage  Nonmastery  Continue  Partial mastery  Continue  Mastery
1      -           [0,1]     -                -         -
2      -           [0,2]     -                -         -
3      -           [0,3]     -                -         -
4      -           [0,4]     -                -         -
5      -           [0,5]     -                -         -
6      -           [0,5]     -                -         6
7      -           [0,6]     -                -         -
8      -           [0,7]     -                -         -
9      0           [1,7]     -                -         8
10     -           [1,8]     -                -         -
11     -           [1,8]     -                -         9
12     1           [2,9]     -                -         -
13     -           [2,10]    -                -         -
14     2           [3,10]    -                -         11
15     -           [3,11]    -                -         -
16     3           [4,11]    -                -         12
17     -           [4,12]    -                -         -
18     4           [5,12]    -                -         13
19     -           [5,13]    -                -         -
20     5           [6,13]    -                -         14
21     -           [6,14]    -                -         -
22     6           [7,14]    -                -         15
23     -           [7,15]    -                -         -
24     7           [8,11]    12               [13,15]   16
25     8           [9,11]    [12,13]          [14,16]   -
26     -           [9,11]    12               [14,16]   17
27     9           [10,12]   14               [15,17]   -
28     10          [11,12]   13 or 15         [16,17]   18
29     11          12        13 or 16         17        18
30     12          -         13 or 17         -         18

Conclusions and Some New Lines of Research 

In this paper, using the framework of empirical Bayesian decision theory, optimal sequential 
rules for the four-action mastery problem (nonmastery, partial mastery, mastery, and 
continuing sampling) were derived. The procedures were demonstrated by an empirical 
example for concept learning in medicine. Both for threshold and linear loss, optimal 
sequential rules were derived with prior knowledge assumed to be represented by a beta 
distribution. 

The results indicated that, regardless of the observed number-correct score, the 
decision-maker takes at least five observations for both loss structures. Furthermore, it turned 
out that the chances of being classified as a nonmaster, partial master, or master increased if 
the number of items administered increased. This result was in accordance with our 
expectations. 

There are a few new lines of research arising from the application of (empirical) 
Bayesian decision theory to sequential mastery testing. The first is the extension of 
determining the optimal sequential decision rules to the case that, in addition to the actions 
nonmastery, partial mastery, mastery, and administer randomly one more item, still another 
action is open to the decision-maker (e.g., mastery with distinction). Following the same line 
of reasoning as in the situation where there are four actions open to the decision-maker, the 
optimal sequential rules can easily be generalized to this sequential five-action mastery 
problem. 

Second, it might be assumed that guessing and carelessness have to be taken into
account. Morgan (1979) has developed a model with corrections for guessing and carelessness
within a Bayesian decision-theoretic framework (see also van den Brink & Koele, 1980). The
results of a computer simulation of the model indicate that guessing and carelessness may
markedly affect the determination of cutting scores, and hence the accuracy of the sequential
decision procedures.

Third, it might also be assumed that no prior knowledge about true level of functioning 
is available. In these circumstances, the maximin procedure might be an appropriate 
framework (e.g., Huynh, 1980; Veldhuyzen, 1982), which requires no prior distribution 
regarding true level of functioning. As an aside, it might be noted that a maximin rule can be 
conceived as a rule that is based on minimization of posterior expected loss as well, but under 
the restriction that the prior is the least favorable of the class of priors (e.g., Ferguson, 1967, 
Sect. 1.6). 

The last line is research into prior distributions, psychometric models (e.g.,
standard-normal model), and loss structures other than the ones assumed here. For example, the
normal ogive function (Novick & Lindley, 1979), which takes loss to be a nonlinear function
of the true level of functioning, might be a realistic representation of the losses actually
incurred. This loss function not only has realistic properties but can also be combined
nicely with a standard normal distribution for the psychometric model.